Active site prediction using evolutionary and structural information
Abstract
MOTIVATION: The identification of catalytic residues is a key step in understanding the function of enzymes. While a variety of computational methods have been developed for this task, accuracies have remained fairly low. The best existing method exploits information from sequence and structure to achieve a precision (the fraction of predicted catalytic residues that are catalytic) of 18.5% at a corresponding recall (the fraction of catalytic residues identified) of 57% on a standard benchmark. Here we present a new method, Discern, which provides a significant improvement over the state-of-the-art through the use of statistical techniques to derive a model with a small set of features that are jointly predictive of enzyme active sites.

RESULTS: In cross-validation experiments on two benchmark datasets from the Catalytic Site Atlas and CATRES resources containing a total of 437 manually curated enzymes spanning 487 SCOP families, Discern increases catalytic site recall between 12% and 20% over methods that combine information from both sequence and structure, and by ≥50% over methods that make use of sequence conservation signal only. Controlled experiments show that Discern's improvement in catalytic residue prediction is derived from the combination of three ingredients: the use of the INTREPID phylogenomic method to extract conservation information; the use of 3D structure data, including features computed for residues that are proximal in the structure; and a statistical regularization procedure to prevent overfitting.
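The precision and recall definitions given in the abstract can be illustrated with a minimal sketch. The function name `precision_recall` and the residue numbers below are hypothetical, chosen only for illustration; they are not taken from Discern or its benchmarks.

```python
def precision_recall(predicted, catalytic):
    """Return (precision, recall) for two collections of residue identifiers.

    precision: fraction of predicted catalytic residues that are catalytic
    recall:    fraction of catalytic residues that were identified
    """
    predicted, catalytic = set(predicted), set(catalytic)
    true_positives = len(predicted & catalytic)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(catalytic) if catalytic else 0.0
    return precision, recall


# Hypothetical residue numbers, for illustration only (not data from the paper).
predicted_residues = {12, 45, 57, 88, 102}
known_catalytic = {45, 57, 195}

p, r = precision_recall(predicted_residues, known_catalytic)
print(f"precision={p:.2f} recall={r:.2f}")  # prints "precision=0.40 recall=0.67"
```

In this toy case two of the five predictions are correct (precision 0.40) and two of the three known catalytic residues are recovered (recall 0.67), which shows how a predictor can reach a moderate recall while most of its predictions are still false positives.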
Similar resources
Automatic classification of highly related Malate Dehydrogenase and L-Lactate Dehydrogenase based on 3D-pattern of active sites
Accurate protein function prediction is an important subject in bioinformatics, especially where sequentially and structurally similar proteins have different functions. Malate dehydrogenase and L-lactate dehydrogenase are two evolutionarily related enzymes, which exist in a wide variety of organisms. These enzymes are sequentially and structurally similar and share common active site residues, spati...
Estimation of LPC coefficients using Evolutionary Algorithms
The vast use of Linear Prediction Coefficients (LPC) in speech processing systems has intensified the importance of their accurate computation. This paper is concerned with computing LPC coefficients using evolutionary algorithms: Genetic Algorithm (GA), Particle Swarm Optimization (PSO), Differential Evolution (DE) and Particle Swarm Optimization with Differentially perturbed Velocity (PSO-DV...
Combining evolutionary and structural information for local protein structure prediction.
We study the effects of various factors in representing and combining evolutionary and structural information for local protein structural prediction based on fragment selection. We prepare databases of fragments from a set of non-redundant protein domains. For each fragment, evolutionary information is derived from homologous sequences and represented as estimated effective counts and frequenc...
A new approach on studying the stability of evolutionary game dynamics for financial systems
Financial market modeling and prediction is a difficult problem, and drastic price changes cause nonlinear dynamics that make price prediction one of the most challenging tasks for economists. Since markets have always been interesting for traders, many traders with various beliefs are highly active in a market. The competition among two agents of traders, namely trend follo...
A Study on the Electronic and Structural Properties of C12X8 (X = C, B) and Their Interaction with Glycine with Potentially Drug Delivery Vessels
In this paper, the structural properties of C20 and C12B8 fullerene interacting with glycine based on three active sites of glycine and one C atom or one B atom in C12B8 were analyzed through the density functional theory. It was found out that the binding of glycine to C12B8 generated a complex. Our results were extremely relevant in order to identify the potential applications of functionalized C...
Functional evolution of PLP-dependent enzymes based on active-site structural similarities.
Families of distantly related proteins typically have very low sequence identity, which hinders evolutionary analysis and functional annotation. Slowly evolving features of proteins, such as an active site, are therefore valuable for annotating putative and distantly related proteins. To date, a complete evolutionary analysis of the functional relationship of an entire enzyme family based on ac...